# VQA Task Optimization
Vilt Finetuned 100
Apache-2.0
A vision-language model based on ViLT-B32-MLM, fine-tuned on VQA datasets.
Text-to-Image
Transformers

bangbrecho
15
0
Vilt Finetuned 200
Apache-2.0
A vision-language model based on the ViLT architecture, fine-tuned on VQA datasets for visual question answering.
Text-to-Image
Transformers

MariaK
84
0